Mining TCGA Data Using Boolean Implications

Authors

  • Subarna Sinha
  • Emily K. Tsang
  • Haoyang Zeng
  • Michela Meister
  • David L. Dill
Abstract

Boolean implications (if-then rules) provide a conceptually simple, uniform and highly scalable way to find associations between pairs of random variables. In this paper, we propose to use Boolean implications to find relationships between variables of different data types (mutation, copy number alteration, DNA methylation and gene expression) from the glioblastoma (GBM) and ovarian serous cystadenocarcinoma (OV) data sets from The Cancer Genome Atlas (TCGA). We find hundreds of thousands of Boolean implications from these data sets. A direct comparison of the relationships found by Boolean implications and those found by commonly used methods for mining associations shows that existing methods would miss relationships found by Boolean implications. Furthermore, many relationships exposed by Boolean implications reflect important aspects of cancer biology. Examples of our findings include cis relationships between copy number alteration, DNA methylation and expression of genes, a new hierarchy of mutations and recurrent copy number alterations, loss-of-heterozygosity of well-known tumor suppressors, and the hypermethylation phenotype associated with IDH1 mutations in GBM. The Boolean implication results used in the paper can be accessed at http://crookneck.stanford.edu/microarray/TCGANetworks/.
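
To make the idea of an if-then rule between two binary variables concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each variable has already been discretized to 0/1 per sample, and it treats "A high implies B high" as holding when the (A = 1, B = 0) cell of the 2x2 table is much sparser than expected under independence. The function names, the sparsity statistic, and the thresholds `min_stat` and `max_error` are illustrative assumptions, not values taken from the paper.

```python
from math import sqrt


def quadrant_counts(a, b):
    """Count samples in each (a, b) cell of the 2x2 contingency table."""
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for x, y in zip(a, b):
        counts[(int(bool(x)), int(bool(y)))] += 1
    return counts


def sparse_quadrant(counts, cell, min_stat=3.0, max_error=0.1):
    """Check whether one cell holds far fewer samples than independence predicts.

    min_stat and max_error are hypothetical thresholds chosen for illustration.
    """
    total = sum(counts.values())
    i, j = cell
    row = counts[(i, 0)] + counts[(i, 1)]   # samples with first variable == i
    col = counts[(0, j)] + counts[(1, j)]   # samples with second variable == j
    if total == 0 or row == 0 or col == 0:
        return False
    expected = row * col / total            # expected count under independence
    observed = counts[cell]
    stat = (expected - observed) / sqrt(expected)
    error = 0.5 * (observed / row + observed / col)
    return stat >= min_stat and error <= max_error


def implies(a, b, **thresholds):
    """True if 'a high => b high' holds, i.e. the (a=1, b=0) cell is sparse."""
    return sparse_quadrant(quadrant_counts(a, b), (1, 0), **thresholds)


# Toy data (fabricated for illustration only): 100 samples in which every
# sample with the mutation is also hypermethylated, but not the reverse.
mutation = [1] * 30 + [0] * 20 + [0] * 50
hypermethylated = [1] * 30 + [1] * 20 + [0] * 50

print(implies(mutation, hypermethylated))   # mutation => hypermethylation
print(implies(hypermethylated, mutation))   # the reverse direction
```

On this toy data the rule holds in one direction only, which is the kind of asymmetric relationship a symmetric measure such as correlation would not report as a rule.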

Similar resources

Mining Large Heterogeneous Cancer Data Sets Using Boolean Implications

Boolean implications (if-then rules) provide a conceptually simple, uniform and highly scalable way to find associations between pairs of random variables. In this paper, we describe their usage in mining associations from large, heterogeneous cancer data sets. Next, we illustrate how Boolean implications were used to discover a new causal association between a mutation and aberrant DNA hyperme...

Some properties of evaluated implications used in knowledge-based systems and data-mining

The core of expert knowledge is typically represented by a set of rules (implications) assigned with weights specifying their (un)certainties. The task of inference mechanism in such rule-based expert systems can be analyzed from the many-valued (fuzzy) logic perspective. On the other hand, implicational relations between two Boolean attributes derived from data (association rules) are quantifie...

A Theoretical Framework for Association Mining Based on the Boolean Retrieval Model

Data mining has been defined as the nontrivial extraction of implicit, previously unknown and potentially useful information from data. Association mining is one of the important sub-fields in data mining, where rules that imply certain association relationships among a set of items in a transaction database are discovered. The efforts of most researchers focus on discovering rules in the form ...

A Novel Boolean Algebraic Framework for Association and Pattern Mining

Data mining has been defined as the nontrivial extraction of implicit, previously unknown and potentially useful information from data. Association mining and sequential mining analysis are considered as crucial components of strategic control over a broad variety of disciplines in business, science and engineering. Association mining is one of the important sub-fields in data mining, where rul...

Using Efficient Boolean Algorithms for Mining Association Rules

In this paper, we use transaction data as the source data for mining, where each transaction records the items a consumer has bought. We mine association rules from two aspects. One is to present a Boolean FP-tree algorithm that mines association rules with Boolean computation, based on the FP-tree algorithm and the CDAR algorithm. The experiments show that the performances of our algorithm are fa...

Journal title:

Volume 9, Issue

Pages -

Publication date: 2014